Monitoring Multiple Data Streams in Real Time
Authors
Abstract
Online monitoring of data streams poses a challenge in many data-centric applications, such as telecommunications networks, traffic management, trend-related analysis, web-click streams, intrusion detection, and sensor networks. Mining techniques employed in these applications have to be efficient in terms of space usage and per-item processing time while providing high-quality answers to (1) aggregate monitoring queries, such as finding surprising levels, i.e., the “volatility” of a data stream, and detecting bursts, and to (2) similarity queries, such as detecting correlations and finding similar patterns. We propose a framework for summarizing a set of data streams and for constructing a composite index structure in order to answer the above types of user queries. In our technique, features of streams are extracted incrementally on the fly at multiple resolutions and inserted into a family of dynamic index structures. We demonstrate the effectiveness of our method over existing techniques through an extensive set of experiments. In answering aggregate queries, our false alarm rate is up to 400 times lower than current solutions. In the case of pattern analysis, our technique offers more than two times better accuracy while minimizing the space required for incremental computation. In detecting correlations, our technique performs up to 60 times better in response time, and up to 20 times better in terms of the quality of answers provided.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment. Proceedings of the 30th VLDB Conference, Toronto, Canada, 2004.
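The idea of extracting features incrementally at multiple resolutions, as the abstract describes, might be illustrated with the minimal sketch below. It is not the authors' method: plain sliding-window sums stand in for whatever features the paper extracts, no index structure is built, and the names `MultiResolutionSums`, `detect_bursts`, `base_window`, `levels`, and `thresholds` are all hypothetical. Each new item updates every resolution in constant time per level, and a burst is reported whenever a window sum exceeds that level's threshold.

```python
from collections import deque


class MultiResolutionSums:
    """Sliding-window sums over windows of size base, 2*base, 4*base, ...

    Illustrative assumption only: simple sums stand in for stream features.
    """

    def __init__(self, base_window, levels):
        self.windows = [base_window * (2 ** j) for j in range(levels)]
        self.sums = [0.0] * levels
        # Keep just enough history to expire items from the largest window.
        self.history = deque(maxlen=self.windows[-1])

    def update(self, value):
        """Fold one new item into every resolution and return the sums."""
        n = len(self.history)
        for j, w in enumerate(self.windows):
            self.sums[j] += value
            if n >= w:
                # The item w positions back now falls out of window j.
                self.sums[j] -= self.history[n - w]
        self.history.append(value)
        return list(self.sums)


def detect_bursts(stream, base_window, levels, thresholds):
    """Return (time, window_size) pairs where a window sum exceeds its threshold."""
    monitor = MultiResolutionSums(base_window, levels)
    alarms = []
    for t, value in enumerate(stream):
        sums = monitor.update(value)
        for j, (total, limit) in enumerate(zip(sums, thresholds)):
            if total > limit:
                alarms.append((t, monitor.windows[j]))
    return alarms


if __name__ == "__main__":
    stream = [1, 1, 1, 1, 10, 10, 1, 1]
    print(detect_bursts(stream, base_window=2, levels=2, thresholds=[15, 30]))
```

In this toy run the two consecutive values of 10 push the size-2 window over its threshold while the size-4 window stays below its own, which shows why monitoring several resolutions at once can separate short bursts from longer ones.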
Related resources
Real-time quality monitoring in debutanizer column with regression tree and ANFIS
A debutanizer column is an integral part of any petroleum refinery. Online composition monitoring of debutanizer column outlet streams is highly desirable in order to maximize the production of liquefied petroleum gas. In this article, data-driven models for the debutanizer column are developed for real-time composition monitoring. The dataset used has seven process variables as inputs and the outp...
Real time contextual collective anomaly detection over multiple data streams
Anomaly detection has always been a critical and challenging problem in many application areas such as industry, healthcare, environment and finance. This problem becomes more difficult in the Big Data era as the data scale increases dramatically and the type of anomalies gets more complicated. In time-sensitive applications like real-time monitoring, data are often fed in streams and anomalies a...
LIVE: Semantic-based Multi-Stream Broadcasting of Media Events
Broadcasting of media events is a real-time action demanding reliable just-in-time decisions based on the current content of incoming video streams and the availability of background material. Multi-stream broadcasting of this type of event thus demands monitoring of multiple streams and background material. Due to the potentially large number of streams and other available material, manual moni...
Replication Schemes to Support Failure Resilient Processing of Real Time Data Streams
In this paper we explore the use of replication for fault tolerant processing of streams. We perform these experiments in the context of the Granules stream processing system that is designed for real time processing of data streams generated by devices and instruments. In this paper we explore well-known replication schemes for fault tolerant processing of data streams. We analyze two basic ap...
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in the data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Leveraging Complex Event Processing for Grid Monitoring
Currently existing monitoring services for Grid infrastructures typically collect information from local agents and store it as data sets in global repositories. However, for some scenarios querying real-time streams of monitoring information would be extremely useful. In this paper, we evaluate Complex Event Processing technologies applied to real-time Grid monitoring. We present a monitoring ...
Publication date: 2003